Automated content scoring of spoken responses containing multiple parts with factual information
Authors
Abstract
This paper presents approaches to automated content scoring of spoken language test responses from non-native speakers of English that contain multiple parts addressing factual information the test taker has previously heard in auditory stimulus materials. While previous work on content scoring of spontaneous, unpredictable speech has focused only on entire responses and on general topic-matching approaches, such as content vector analysis, the specific nature of the spoken responses in our data requires response segmentation and the extraction of features that indicate the relevance and correctness of the facts contained in the different parts of the response. Our best content features, based on similarity with key facts and concepts, achieve correlations of r = 0.615 (for speech recognition output) and r = 0.637 (using human transcriptions) with expert human rater scores. Furthermore, we show that these content features outperform traditional vector-space-based features. Finally, we demonstrate that the performance of a scoring model combining previously developed features with some of the newly designed content features improves significantly from r = 0.624 to r = 0.664 on an unseen evaluation set when using speech recognition output.
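The key-fact similarity features described in the abstract can be illustrated with a minimal bag-of-words sketch: each segmented response part is compared against a list of expected key facts, and the best match is taken as that part's content score. This is an illustrative assumption, not the paper's actual feature set; the function names, tokenization, and scoring scheme here are simplified stand-ins.

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def max_key_fact_similarity(response_part: str, key_facts: list[str]) -> float:
    """Score one response part by its best match against the expected key facts."""
    return max((cosine_similarity(response_part, f) for f in key_facts), default=0.0)

# Hypothetical usage: one part of a segmented spoken response, scored against
# the key facts listeners were expected to recall from the stimulus.
key_facts = ["the train departs at noon", "tickets cost ten dollars"]
score = max_key_fact_similarity("the train departs around noon", key_facts)
```

A real system would operate on speech recognition output, so features of this kind must tolerate recognition errors; simple token overlap, as here, degrades gracefully because partial matches still yield nonzero similarity.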
Similar resources
Prompt-based Content Scoring for Automated Spoken Language Assessment
This paper investigates the use of prompt-based content features for the automated assessment of spontaneous speech in a spoken language proficiency assessment. The results show that the single highest-performing prompt-based content feature measures the number of unique lexical types that overlap with the listening materials and are not contained in either the reading materials or a sample response,...
Scoring Spoken Responses Based on Content Accuracy
Content accuracy has not been fully utilized in previous studies on automated speaking assessment. Compared to writing tests, responses in speaking tests are noisy (due to recognition errors), full of incomplete sentences, and short. To handle these challenges for content scoring in speaking tests, we propose two new methods based on information extraction (IE) and machine learnin...
Modeling Discourse Coherence for the Automated Scoring of Spontaneous Spoken Responses
This study describes an approach for modeling the discourse coherence of spontaneous spoken responses in the context of automated assessment of non-native speech. Although the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spontaneous spoken language, little prior research has been done to assess a speaker’s coherence in the context of a...
Using an Ontology for Improved Automated Content Scoring of Spontaneous Non-Native Speech
This paper presents an exploration into automated content scoring of non-native spontaneous speech using ontology-based information to enhance a vector space approach. We use content vector analysis as a baseline and evaluate the correlations between human rater proficiency scores and two cosine-similarity-based features, previously used in the context of automated essay scoring. We use two ont...
Automated Content Scoring of Spoken Responses in an Assessment for Teachers of English
This paper presents and evaluates approaches to automatically score the content correctness of spoken responses in a new language test for teachers of English as a foreign language who are non-native speakers of English. Most existing tests of English spoken proficiency elicit responses that are either very constrained (e.g., reading a passage aloud) or are of a predominantly spontaneous nature...
Publication date: 2013